Manhattan
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Kansas > Riley County > Manhattan (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (3 more...)
- Transportation (0.46)
- Information Technology (0.46)
The Role of Review Process Failures in Affective State Estimation: An Empirical Investigation of DEAP Dataset
Khan, Nazmun N, Sweet, Taylor, Harvey, Chase A, Knapp, Calder, Krusienski, Dean J., Thompson, David E
The reliability of affective state estimation using EEG data is in question, given the variability in reported performance and the lack of standardized evaluation protocols. To investigate this, we reviewed 101 studies, focusing on the widely used DEAP dataset for emotion recognition. Our analysis revealed widespread methodological issues that include data leakage from improper segmentation, biased feature selection, flawed hyperparameter optimization, neglect of class imbalance, and insufficient methodological reporting. Notably, we found that nearly 87% of the reviewed papers contained one or more of these errors. Moreover, through experimental analysis, we observed that such methodological flaws can inflate the classification accuracy by up to 46%. These findings reveal fundamental gaps in standardized evaluation practices and highlight critical deficiencies in the peer review process for machine learning applications in neuroscience, emphasizing the urgent need for stricter methodological standards and evaluation protocols.
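One of the issues named above, data leakage from improper segmentation, typically arises when short segments cut from the same recording end up in both the training and the test set. The following is a minimal sketch of a trial-wise split that avoids this; it is not the authors' evaluation code, and the function name `split_by_trial` and the toy layout (8 trials, 5 segments each) are assumptions made for illustration.

```python
import random
from collections import defaultdict

def split_by_trial(segment_trials, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) so that no trial contributes to both sets.

    segment_trials[i] is the ID of the trial that segment i was cut from.
    """
    by_trial = defaultdict(list)
    for idx, trial in enumerate(segment_trials):
        by_trial[trial].append(idx)

    # Shuffle whole trials, not individual segments.
    trials = sorted(by_trial)
    random.Random(seed).shuffle(trials)
    n_test = max(1, round(len(trials) * test_fraction))
    test_trials = set(trials[:n_test])

    train_idx = [i for t in trials if t not in test_trials for i in by_trial[t]]
    test_idx = [i for t in test_trials for i in by_trial[t]]
    return sorted(train_idx), sorted(test_idx)

# Hypothetical layout: 8 trials, each cut into 5 segments.
segment_trials = [trial for trial in range(8) for _ in range(5)]
train_idx, test_idx = split_by_trial(segment_trials)

train_trials = {segment_trials[i] for i in train_idx}
test_trials = {segment_trials[i] for i in test_idx}
print(len(train_idx), len(test_idx), train_trials & test_trials)
```

Shuffling individual segments instead of whole trials would let near-identical neighbours of a test segment appear in training, which is the kind of leakage the abstract describes.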
- North America > United States > Kansas > Riley County > Manhattan (0.04)
- Asia > India > Tamil Nadu > Chennai (0.04)
- North America > United States > Virginia > Richmond (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)
LSH-DynED: A Dynamic Ensemble Framework with LSH-Based Undersampling for Evolving Multi-Class Imbalanced Classification
The classification of imbalanced data streams, which have unequal class distributions, is a key difficulty in machine learning, especially when dealing with multiple classes. While binary imbalanced data stream classification tasks have received considerable attention, only a few studies have focused on multi-class imbalanced data streams. Effectively managing the dynamic imbalance ratio is a key challenge in this domain. This study introduces a novel, robust, and resilient approach to address these challenges by integrating Locality Sensitive Hashing with Random Hyperplane Projections (LSH-RHP) into the Dynamic Ensemble Diversification (DynED) framework. To the best of our knowledge, we present the first application of LSH-RHP for undersampling in the context of imbalanced non-stationary data streams. The proposed method undersamples the majority classes by utilizing LSH-RHP, provides a balanced training set, and improves the ensemble's prediction performance. We conduct comprehensive experiments on 23 real-world and ten semi-synthetic datasets and compare LSH-DynED with 15 state-of-the-art methods. The results reveal that LSH-DynED outperforms other approaches in terms of both Kappa and mG-Mean effectiveness measures, demonstrating its capability in dealing with multi-class imbalanced non-stationary data streams. Notably, LSH-DynED performs well in large-scale, high-dimensional datasets with considerable class imbalances and demonstrates adaptation and robustness in real-world circumstances. To motivate our design, we review existing methods for imbalanced data streams, outline key challenges, and offer guidance for future work. For the reproducibility of our results, we have made our implementation available on GitHub.
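To make the idea of undersampling with random hyperplane projections more concrete, here is a generic sketch: each majority-class sample is hashed to a bit signature (one bit per random hyperplane), and samples are then drawn round-robin from the resulting buckets. This is not the LSH-DynED implementation; the function names, the number of hyperplanes, and the round-robin selection rule are assumptions, and the input is assumed to be non-empty.

```python
import random
from collections import defaultdict

def rhp_signature(x, hyperplanes):
    # One bit per hyperplane: which side of the plane the point lies on.
    return tuple(int(sum(w * v for w, v in zip(h, x)) >= 0.0) for h in hyperplanes)

def lsh_undersample(samples, target_size, n_planes=4, seed=0):
    """Keep at most target_size samples, spread across LSH buckets."""
    rng = random.Random(seed)
    dim = len(samples[0])
    hyperplanes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_planes)]

    # Group samples whose signatures collide (roughly: similar direction).
    buckets = defaultdict(list)
    for x in samples:
        buckets[rhp_signature(x, hyperplanes)].append(x)
    for members in buckets.values():
        rng.shuffle(members)

    kept = []
    # Round-robin over buckets so every occupied region stays represented.
    while len(kept) < target_size and any(buckets.values()):
        for members in buckets.values():
            if members and len(kept) < target_size:
                kept.append(members.pop())
    return kept

# Hypothetical majority class of 200 two-dimensional points, reduced to 20.
data_rng = random.Random(1)
majority = [[data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)] for _ in range(200)]
kept = lsh_undersample(majority, target_size=20)
print(len(kept))
```

Drawing across buckets rather than uniformly at random is one way to keep sparse regions of the majority class from disappearing after undersampling.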
- Oceania > Australia (0.04)
- North America > United States > Kansas > Riley County > Manhattan (0.04)
- Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
- (8 more...)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.66)
- Education > Educational Setting > Online (1.00)
- Health & Medicine (0.67)
Supervised Models Can Generalize Also When Trained on Random Labels
Allerbo, Oskar, Schön, Thomas B.
The success of unsupervised learning raises the question of whether supervised models can also be trained without using the information in the output $y$. In this paper, we demonstrate that this is indeed possible. The key step is to formulate the model as a smoother, i.e. of the form $\hat{f}=Sy$, and to construct the smoother matrix $S$ independently of $y$, e.g. by training on random labels. We present a simple model selection criterion based on the distribution of the out-of-sample predictions and show that, in contrast to cross-validation, this criterion can also be used without access to $y$. We demonstrate on real and synthetic data that $y$-free trained versions of linear and kernel ridge regression, smoothing splines, and neural networks perform similarly to their standard, $y$-based, versions and, most importantly, significantly better than random guessing.
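As a reminder of what the smoother form looks like for two of the models named above, the standard textbook expressions for ridge and kernel ridge regression are given below. The symbols $X$ (design matrix), $K$ (kernel matrix on the training inputs), and $\lambda$ (regularization strength) are notation assumed here, not taken from the abstract.

```latex
\hat{f} = S y, \qquad
S_{\text{ridge}} = X \left( X^{\top} X + \lambda I \right)^{-1} X^{\top}, \qquad
S_{\text{kernel ridge}} = K \left( K + \lambda I \right)^{-1}
```

For a fixed $\lambda$ (and a fixed kernel), neither matrix involves $y$; the labels enter only when $S$ is applied to them. How the hyperparameters are chosen without access to $y$ is the subject of the paper and is not reproduced here.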
- North America > Canada > Ontario > Toronto (0.04)
- North America > United States > Kansas > Riley County > Manhattan (0.04)
- North America > United States > California (0.04)
- (2 more...)
- Research Report (0.64)
- Instructional Material > Course Syllabus & Notes (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)
Deep Learning for Speech Emotion Recognition: A CNN Approach Utilizing Mel Spectrograms
Values taken from SER Classifier notebook. Next, the model was tested on unique audio from myself, family, and friends. Surprisingly, it performed well, especially with negative emotions. For example, it correctly predicted male anger with over 90% accuracy, often distinguishing it from other emotions like male disgust, female anger, and male sadness. An interesting test involved a friend with Asperger's syndrome, who struggles with recognizing emotions. While the model's accuracy seemed initially low, further analysis revealed that her own perception of emotions was misaligned with the model's predictions, which were actually more accurate. Finally, the model was tested on German and Swiss German audio, where it performed well in predicting anger, sadness, and disgust. However, it made some errors with positive emotions. In all cases of failure, the target emotion remained within the top 5 predicted classes, demonstrating the model's robustness.
- North America > United States > Kansas > Riley County > Manhattan (0.04)
- North America > United States > Alabama (0.04)
- Asia (0.04)
- Education (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.74)
Explainable identification of similarities between entities for discovery in large text
Joshi, Akhil, Erukude, Sai Teja, Shamir, Lior
With the availability of a virtually infinite number of text documents in digital format, automatic comparison of textual data is essential for extracting meaningful insights that are difficult to identify manually. Many existing tools, including AI and large language models, struggle to provide precise and explainable insights into textual similarities. In many cases they determine the similarity between documents as reflected by the text, rather than the similarities between the subjects being discussed in these documents. This study addresses these limitations by developing an n-gram analysis framework designed to compare documents automatically and uncover explainable similarities. A scoring formula assigns each n-gram a weight that is higher when the n-gram is more frequent in both documents, but is penalized when the n-gram is more frequent in the English language. Visualization tools like word clouds enhance the representation of these patterns, providing clearer insights. The findings demonstrate that this framework effectively uncovers similarities between text documents, offering explainable insights that are often difficult to identify manually. This non-parametric approach provides a deterministic solution for identifying similarities across various fields, including biographies, scientific literature, historical texts, and more. Code for the method is publicly available.
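The abstract does not give the scoring formula, so the sketch below uses a stand-in with the same qualitative behaviour: an n-gram scores higher the more often it occurs in both documents and lower the more common it is in general English. The formula `min(count_a, count_b) / (1 + background)`, the function names, and the toy background frequencies are all assumptions for illustration, not the published method.

```python
from collections import Counter

def ngrams(text, n=2):
    # Lowercased whitespace tokens joined into overlapping n-grams.
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def shared_ngram_weights(doc_a, doc_b, background_freq, n=2):
    """Weight n-grams shared by both documents (hypothetical scoring rule)."""
    counts_a = Counter(ngrams(doc_a, n))
    counts_b = Counter(ngrams(doc_b, n))
    weights = {}
    for gram in counts_a.keys() & counts_b.keys():
        shared = min(counts_a[gram], counts_b[gram])  # frequent in both documents
        # Penalize n-grams that are common in the language as a whole.
        weights[gram] = shared / (1.0 + background_freq.get(gram, 0.0))
    # Highest weight first; ties broken alphabetically for determinism.
    return dict(sorted(weights.items(), key=lambda kv: (-kv[1], kv[0])))

doc_a = "she studied at the royal academy and taught at the royal academy"
doc_b = "he joined the royal academy in the city at the royal academy"
# Toy background frequencies (made up): larger means more common in English.
background = {"at the": 9.0, "the royal": 1.0}

weights = shared_ngram_weights(doc_a, doc_b, background)
for gram, weight in weights.items():
    print(f"{gram}: {weight:.2f}")
```

Because every weight traces back to a specific shared n-gram and its counts, the ranking can be inspected directly, which is the sense in which such a score is explainable.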
- Europe > Germany (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- Asia > Middle East > Saudi Arabia (0.04)
- (6 more...)
- Overview (0.68)
- Personal > Honors (0.47)
- Research Report > New Finding (0.34)
- Media > Music (1.00)
- Leisure & Entertainment > Sports > Soccer (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
An Integrated Deep Learning Framework Leveraging NASNet and Vision Transformer with MixProcessing for Accurate and Precise Diagnosis of Lung Diseases
Saleem, Sajjad, Sharif, Muhammad Imran
The lungs are the essential organs of respiration, responsible for the exchange of oxygen and carbon dioxide that sustains human life. However, several lung diseases, including pneumonia, tuberculosis, COVID-19, and lung cancer, are serious health challenges that demand early and precise diagnosis. This study proposes a new deep learning framework called NASNet-ViT, which combines the convolutional capability of NASNet with the global attention mechanism of the Vision Transformer (ViT). The proposed model classifies lung conditions into five classes: lung cancer, COVID-19, pneumonia, TB, and normal. A multi-faceted preprocessing strategy called MixProcessing is used to improve diagnostic accuracy; it combines wavelet transform, adaptive histogram equalization, and morphological filtering techniques. The NASNet-ViT model achieves an accuracy of 98.9%, a sensitivity of 0.99, an F1-score of 0.989, and a specificity of 0.987, outperforming other state-of-the-art architectures such as MixNet-LD, D-ResNet, MobileNet, and ResNet50. The model's efficiency is further emphasized by its compact size of 25.6 MB and a low computational time of 12.4 seconds, making it suitable for real-time, clinically constrained environments. These results reflect the capability of NASNet-ViT in extracting meaningful features and recognizing various types of lung diseases with very high accuracy. This work contributes to medical image analysis by providing a robust and scalable solution for diagnostics in lung diseases.
- Asia > China > Guangdong Province > Shenzhen (0.04)
- North America > United States > Virginia > Alexandria County > Alexandria (0.04)
- North America > United States > Kansas > Riley County > Manhattan (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
Building Knowledge Graphs Towards a Global Food Systems Datahub
Gelal, Nirmal, Gautam, Aastha, Norouzi, Sanaz Saki, Giordano, Nico, Silva, Claudio Dias da Jr, Francois, Jean Ribert, Onofre, Kelsey Andersen, Nelson, Katherine, Hutchinson, Stacy, Lin, Xiaomao, Welch, Stephen, Lollato, Romulo, Hitzler, Pascal, McGinty, Hande Küçük
Sustainable agricultural production aligns with several sustainability goals established by the United Nations (UN). However, there is a lack of studies that comprehensively examine sustainable agricultural practices across various products and production methods. Such research could provide valuable insights into the diverse factors influencing the sustainability of specific crops and produce while also identifying practices and conditions that are universally applicable to all forms of agricultural production. While this research might help us better understand sustainability, the community would still need a consistent set of vocabularies. These consistent vocabularies, which represent the underlying datasets, can then be stored in a global food systems datahub. The standardized vocabularies might help encode important information in the datasets for further statistical analyses and AI/ML approaches, supporting research that targets sustainable agricultural production. A structured method of representing information in sustainability, especially for wheat production, is currently unavailable. In an attempt to address this gap, we are building a set of ontologies and Knowledge Graphs (KGs) that encode knowledge associated with sustainable wheat production using formal logic. The data for this set of knowledge graphs are collected from public data sources, results from our experiments at Kansas State University, and a Sustainability Workshop that we organized earlier in the year, which helped us collect input from different stakeholders throughout the wheat value chain. The modeling of the ontology (i.e., the schema) for the Knowledge Graph has been in progress with the help of our domain experts, following a modular structure using the KNARM methodology. In this paper, we will present our preliminary results and the schemas of our Knowledge Graph and ontologies.
- North America > United States > Missouri > Boone County > Columbia (0.14)
- North America > United States > Oregon > Benton County > Corvallis (0.04)
- North America > United States > Kansas > Riley County > Manhattan (0.04)
A Review of Artificial Intelligence Impacting Statistical Process Monitoring and Future Directions
Chang, Shing I, Ghafariasl, Parviz
It has been 100 years since statistical process control (SPC) or statistical process monitoring (SPM) was first introduced for production processes and later applied to service, healthcare, and other industries. The techniques applied to SPM applications are mostly statistically oriented. Recent advances in Artificial Intelligence (AI) have reinvigorated interest in adopting AI for SPM applications. This manuscript begins with a concise review of the historical development of the statistically based SPM methods. Next, this manuscript explores AI and Machine Learning (ML) algorithms and methods applied in various SPM applications, addressing univariate, multivariate, profile, and image quality characteristics. These AI methods can be classified into the following categories: classification, pattern recognition, time series applications, and generative AI. Specifically, different kinds of neural networks, such as artificial neural networks (ANN), convolutional neural networks (CNN), recurrent neural networks (RNN), and generative adversarial networks (GAN), are among the most implemented AI methods impacting SPM. Finally, this manuscript outlines a couple of future directions that harness the potential of the Large Multimodal Model (LMM) for advancing SPM research and applications in complex systems. The ultimate objective is to transform statistical process monitoring (SPM) into smart process control (SMPC), where corrective actions are autonomously implemented to either prevent quality issues or restore process performance.
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Tennessee > Davidson County > Nashville (0.04)
- (11 more...)
- Overview (1.00)
- Research Report > New Finding (0.92)
- Information Technology > Security & Privacy (1.00)
- Energy (1.00)
- Government (0.92)
- (3 more...)
Fine-Tuned LLMs are "Time Capsules" for Tracking Societal Bias Through Books
Madhusudan, Sangmitra, Morabito, Robert, Reid, Skye, Sadr, Nikta Gohari, Emami, Ali
Books, while often rich in cultural insights, can also mirror societal biases of their eras - biases that Large Language Models (LLMs) may learn and perpetuate during training. We introduce a novel method to trace and quantify these biases using fine-tuned LLMs. We develop BookPAGE, a corpus comprising 593 fictional books across seven decades (1950-2019), to track bias evolution. By fine-tuning LLMs on books from each decade and using targeted prompts, we examine shifts in biases related to gender, sexual orientation, race, and religion. Our findings indicate that LLMs trained on decade-specific books manifest biases reflective of their times, with both gradual trends and notable shifts. For example, model responses showed a progressive increase in the portrayal of women in leadership roles (from 8% to 22%) from the 1950s to 2010s, with a significant uptick in the 1990s (from 4% to 12%), possibly aligning with third-wave feminism. Same-sex relationship references increased markedly from the 1980s to 2000s (from 0% to 10%), mirroring growing LGBTQ+ visibility. Concerningly, negative portrayals of Islam rose sharply in the 2000s (26% to 38%), likely reflecting post-9/11 sentiments. Importantly, we demonstrate that these biases stem mainly from the books' content and not the models' architecture or initial training. Our study offers a new perspective on societal bias trends by bridging AI, literary studies, and social science research.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > New York (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- (21 more...)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Health & Medicine (1.00)
- Law > Criminal Law (0.92)
- (2 more...)